Host memory interface for a parallel processor

ABSTRACT

A memory interface for a parallel processor which has an array of processing elements and can receive a memory address and supply the memory address to a memory connected to the processing elements. The processing elements transfer data to and from the memory at the memory address. The memory interface can connect to a host configured to access data in a conventional SDRAM memory device so that the host can access data in the memory.

FIELD OF THE INVENTION

[0001] The present invention relates to accessing data in a parallel processor including a memory array. Preferred embodiments of the present invention relate to accessing of data stored in memory connected to an array of processing elements in an active memory device by a host configured for connection with a conventional memory device.

BACKGROUND TO THE INVENTION

[0002] A simple computer generally includes a central processing unit (CPU) and a main memory. The CPU implements a sequence of operations encoded in a stored program. The program and the data on which the CPU acts are typically stored in the main memory. The processing of the program and the allocation of main memory and other resources are controlled by an operating system. In operating systems where multiple applications may share and partition resources, the computer's processing performance can be improved through use of active memory.

[0003] Active memory is memory that processes data as well as storing it. It can be instructed to operate on its contents without transferring its contents to the CPU or to any other part of the system. This is typically achieved by distributing parallel processors throughout the memory. Each parallel processor is connected to the memory and operates on the memory independently of the other processing elements. Most of the data processing is performed within the active memory and the work of the CPU is thus reduced to the operating system tasks of scheduling processes and allocating system resources.

[0004] A block of active memory typically consists of the following: a block of memory, e.g. dynamic random access memory (DRAM), an interconnection block and a memory processor processing element array. The interconnection block provides a path that allows data to flow between the block of memory and the processing element array. The processing element array typically includes multiple identical processing elements controlled by a sequencer. Processing elements are generally small in area, have a low degree of hardware complexity, and are quick to implement, which leads to increased optimisation. Processing elements are usually designed to balance performance and cost. A simple, more general-purpose processing element will result in a higher level of performance than a more complex processing element because it can be easily coupled to many identical processing elements. Further, because of its simplicity, the processing element will clock at a faster rate.

[0005] In any computer system, it is important that data can be made available to the processor as quickly as possible. In an active memory device, the complexity of the device means that data has to be accessed from the memory via the processing elements. Thus, the speed of access to the memory by a host processor is reduced. In addition, the added complexity that an active memory device bestows on a computer system means that additional complexity is added to the method of accessing data from the active memory device, which itself imparts additional complexity on the host processor.

[0006] In current systems, due to this additional complexity, a host connected to an active memory device has to be custom designed specifically for the active memory device. Thus, hosts configured for connection with one type of active memory device cannot be used with a different type of active memory device. Furthermore, hosts which have been designed for connection with conventional memory devices, such as standard SDRAM memories, cannot be connected to active memory devices at all. As such, considerable expense is incurred in the development of computer systems using active memory devices, since not only does the active memory device have to be designed and built, but also a complete host system to operate with it. Conventional memory devices are defined as any type of non-active memory devices which can be addressed by conventional memory command signals conforming to common industry standards.

[0007] Accordingly, it is an object of the present invention to provide a standard memory interface for an active memory device which permits different types of host processors to access the memory in the device.

[0008] It is a further object of the present invention to provide a memory interface for an active memory device for use with conventional host processors which are configured to connect to standard “non-active” memory devices, such as a standard SDRAM memory module.

SUMMARY OF THE INVENTION

[0009] In view of the foregoing and in accordance with one aspect of the present invention, there is provided a memory interface for a parallel processor having an array of processing elements, the memory interface being adapted to operate as follows:

[0010] to receive memory control signals and memory addresses from a host;

[0011] to apply at least a portion of the memory addresses to a memory connected to the processing elements; and

[0012] to apply control signals to the processing elements, such that in response the processing elements transfer data:

[0013] to and from the memory at the memory address; or

[0014] to and from the host; or

[0015] both; and

[0016] wherein the memory interface is adapted to connect to a host configured to access data in a conventional memory device, such that the host can access data in the memory.

[0017] The memory control signals and memory addresses may include a row address signal RAS, a row address, a column address signal CAS, a column address and a write enable signal WE.

[0018] The present invention further provides a memory interface for a parallel processor having an array of processing elements, the memory interface being adapted to operate as follows:

[0019] to receive memory control signals and memory addresses from a host;

[0020] to apply at least a portion of the memory addresses to a memory connected to the processing elements; and

[0021] to apply control signals to the processing elements, such that in response the processing elements transfer data:

[0022] to and from the memory at the memory address; or

[0023] to and from the host; or

[0024] both; and

[0025] wherein the memory control signals and memory addresses include a row address signal RAS, a row address, a column address signal CAS, a column address and a write enable signal WE.

[0026] Preferably, on receipt of a row address and a first configuration of memory control signals including a RAS assertion, the interface activates a page of data by transferring data from the row in the memory corresponding to the row address, into the processing elements.

[0027] Preferably, on receipt of a second configuration of memory command signals, the interface deactivates the page of data by transferring data from the processing elements into the row in the memory corresponding to the row address.

[0028] Preferably, on receipt of a column address and a third configuration of memory command signals including a CAS assertion and a WE deassertion, the interface transfers data from the activated page of data in the processing elements to the host, beginning with data from the column in the memory corresponding to the column address.

[0029] Preferably, on receipt of a column address and a fourth configuration of memory command signals including a CAS assertion and a WE assertion, the interface transfers data from the host to the activated page of data in the processing elements, beginning with data for the column in the memory corresponding to the column address.

[0030] The present invention further provides an active memory comprising a memory, an array of processing elements connected to the memory and a memory interface, and methods of reading and writing to such a memory.

BRIEF DESCRIPTION OF THE DRAWINGS

[0031] A specific embodiment will now be described by way of example only and with reference to the accompanying drawings, in which:

[0032] FIG. 1 shows one embodiment of an active memory block in accordance with the present invention;

[0033] FIG. 2 shows one embodiment of the components of the active memory block in accordance with the present invention;

[0034] FIG. 3 shows one embodiment of control logic in the memory interface;

[0035] FIG. 4 shows one embodiment of a processing element in the active memory block in accordance with the present invention;

[0036] FIGS. 5a and 5b show representations of the array of processing elements in accordance with the present invention;

[0037] FIGS. 6a to 6c show different array address mappings in accordance with the present invention;

[0038] FIGS. 6d to 6e show different mappings of bytes within a 32-bit word stored in host registers in the processing elements in accordance with the present invention;

[0039] FIG. 7 shows a state diagram for a finite state machine in the control logic in accordance with the present invention; and

[0040] FIGS. 8-15 are timing diagrams showing the operation of various memory commands.

DETAILED DESCRIPTION

[0041] Referring to FIG. 1, one embodiment of an active memory block in accordance with the invention is shown. Active memory block 100 includes a memory 106 and a PE array 110 of processing elements (PEs). Memory 106 is preferably random access memory (RAM), in particular DRAM. The PE array 110 communicates with memory 106 via an interconnection block 108. The interconnection block 108 can be any suitable communications path, such as a bidirectional high bandwidth path. A host 102, which in this case is a central processing unit (CPU), communicates with the PE array 110 via memory interface 112. The memory interface 112 further communicates with the memory 106 via a DRAM control unit (DCU) 114. The memory interface includes conventional address, data and control lines.

[0042] Referring to FIG. 2, the active memory block 100 is shown connected to the host 102. The active memory block 100 comprises the memory 106, an array 110 of processing elements and the memory interface 112 having control logic 204 and a data register 206. The data register 206 is connected to the host 102 by a first data path 208 which is adapted to transfer high bandwidth data between the host 102 and the data register 206. The host 102 supplies a memory address 210 in the conventional way, using row (MSBs) and column (LSBs) addresses and RAS and CAS assertions, and other conventional memory access command signals 212 to the control logic 204. A READY signal 222 is generated by the control logic 204 and sent back to the host 102 to indicate that further command signals 212 can be sent.

[0043] The control logic 204 interprets the conventional memory access command signals 212 and the memory address 210. It generates an array address 214 from the column address of the memory address 210, array control signals 216 which are sent to the PE array 110, and memory control signals 218 which are sent to the memory 106 via the DCU 114. The processing elements in the PE array 110 are configured to receive or send a row of data from or to the row in the memory 106 corresponding to the row address (MSBs) of the memory address 210. The PE array 110 is configured to respond to the array control signals 216 and the array address 214 to transfer data from the processing elements addressed by the array address 214. The data is transferred between the memory 106 and the PE array 110 via the interconnection block 108 and between the host 102 and the PE array 110 via the first and second data paths 208, 220 which are linked across the data register 206.

[0044] The control logic 204 also receives a page command signal 224 from the host 102 to determine which of two pages of data in the PE array 110 to address. The selection of the page is made via the array control signals 216.

[0045] Referring to FIG. 3, the control logic 204 is shown including an address register 302 for receiving the memory address 210 from the host 102 and a mode register 304 for generating mode signals 312. A finite state machine (FSM) 306 receives the command signals 212 from the host 102 and the mode signals 312 from the mode register 304 and generates the memory control signals 218 and array control signals 216. Address transform logic 308 generates an array address 214 from the column address (LSBs) of the memory address 210 and sends it to the PE array 110, to address the appropriate processing elements in the PE array 110 corresponding to the array address 214 and the mapping of the addresses to the processing elements, as specified by the mode signals 312.

[0046] The contents of the mode register 304 are used to determine the data ordering in the PE array 110 and the memory 106. The mode register 304 sends mode signals 312 to the address transform logic 308 and the DCU 114 so that the address transform logic 308 can interpret and address the data in the PE array 110 correctly and the DCU 114 can address the data in the memory 106.

[0047] Referring to FIG. 4, a processing element 400 in the PE array 110 is shown comprising a DRAM interface 401 for connecting the memory 106 and the memory interface 112 with the processing element 400. Also included in the processing element 400 is a register file 406 between the result pipe 408 and processing logic 410. Data from the memory 106 is sent via the DRAM interface 401 to be processed in the processing logic 410 and moved between other processing elements in the PE array 110 via the result pipe 408. The DRAM interface 401 comprises host registers (H-registers) 402 and DRAM registers 404. The H-registers 402 receive data from and send data to the memory interface 112 via the second data path 220.

[0048] The H-registers 402 are arranged in a first bank 451 and a second bank 452, each bank corresponding respectively to a first and second page of data to be stored in the H-registers 402 of all of the processing elements. The page to be addressed is determined by the page command signal 224 which is interpreted by the FSM 306 and sent to the PE array 110 with the array control signals 216. Thus, at any given time, two pages of data can be active in the PE array 110.

[0049] Every command issued to the interface by a host processor or external I/O device is accompanied by a page select. The interface maintains a complete set of operational parameters for each page (for example the DRAM address used by the ACTIVE command). A page consists of four planes of DRAM bytes in the H-registers in each PE, or 1024 bytes. The data in the first plane is taken from the DRAM data at the page or row address supplied with the ACTIVE command described below. Once a page is held in the H-registers 402, burst reads and writes can take place as described below. The interface data input and output ports are 32 bits wide, and so the unit of data transfer during bursts is the 32 bit word. Each page contains 256 32 bit words, which are addressed with eight address bits. The mapping mode, described below, determines the way that each eight bit address maps to the bytes within the H-registers.

[0050] The DRAM registers 404 receive data from and send data to the memory 106 at the row corresponding to the row address (MSBs) of the memory address 210 via the interconnection block 108. The data is received from the DRAM registers 404 and transferred to and from the memory interface 112 via one of the banks of H-registers 402, the bank being specified by the array control signals 216. Each H-register can store one byte (8 bits) of data. Thus, a given processing element 400 can store a 32 bit word for each of the two pages.

[0051] Referring to FIGS. 5a, 5b and 6a to 6c, a representation of the PE array 110 is shown having individual processing elements 400. In FIG. 5b, the first page 500 of data is shown with the H-registers 402 in the first bank 451 represented by four layers 501, 502, 503, 504 of H-registers 402. The second page of data is not shown, but in a similar way to the first page 500 uses four H-registers 402 in the second bank 452 and operates in a similar manner to the first page 500 as discussed below.

[0052] For the first page 500, each layer 501, 502, 503, 504 of H-registers corresponds to first, second, third or fourth H-registers in each processing element 400. For the PE array 110 shown in FIG. 5b, which has 16 rows and 16 columns, there are 256 processing elements and 1024 bytes of data in the first page 500.

[0053] FIGS. 6a to 6c show different mappings of data in the PE array 110, the type of mapping being set or interpreted by the mode signals 312. The second data path 220 is 32 bits wide, so the corresponding unit of data transfer from the H-registers 402 to the data register 206 is a 32 bit word. There are 256 processing elements in the PE array 110 and therefore 256 32 bit words which are addressed by an array address 214 which is 8 bits wide.

[0054] In FIG. 6a, 32 bits of data are contained in each processing element 601, with 8 bits of data held in each of the four H-registers 402 in each processing element. This is referred to as ‘word’ mapping and is used for 32 bit processing element operations. Each array address corresponds to an entire processing element.

[0055] In FIG. 6b, 2×16 bits of data are contained in each processing element 601, 602, with 32 bits of data in total held across two H-registers 402 in each of two processing elements 601, 602. This is referred to as ‘half-word’ mapping and is used for 16 bit processing element operations. Thus, for each processing element, there are two mapped array addresses, with each array address corresponding to two different H-registers.

[0056] In FIG. 6c, 4×8 bits of data are contained in each processing element 601, 602, 603, 604, with 32 bits of data held across a single H-register 402 in each of four processing elements 601, 602, 603, 604. This is referred to as ‘byte’ mapping and is used for 8 bit processing element operations. Thus, for each processing element, there are four mapped array addresses, with each array address corresponding to a different H-register.

[0057] In addition to the aforementioned mappings of data in the PE array 110, the endianism of the data can be set by the host 102, i.e. the ordering of the bytes in each 32 bit word stored in the H-registers 402. There are two different orderings of bytes: big endian and little endian. Routines in the processing elements expect multi-byte words to be stored in the register file in a particular way and by convention big endian is the normal mode, which means that the most significant byte of a multi-byte number is held in the lowest addressed register.

[0058] Big endian mode 670 is shown in FIG. 6d, which shows a lowest addressed register 671 containing a most significant byte 672 of a 32-bit word and a highest addressed register 673 containing a least significant byte 674. Little endian mode 680 is shown in FIG. 6e, which shows the lowest addressed register 671 containing the least significant byte 672 of a 32-bit word and the highest addressed register 673 containing the most significant byte 674.
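By way of illustration only, the two byte orderings described above may be sketched in Python as follows. The function names and the representation of the four H-registers of one page as a list indexed from the lowest addressed register are assumptions made for this sketch and are not taken from the embodiment.

```python
def word_to_h_registers(word: int, little_endian: bool = False) -> list[int]:
    """Split a 32-bit word into four bytes; index 0 is the lowest addressed register."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word must fit in 32 bits")
    # Big endian (the normal mode): most significant byte in the lowest addressed register.
    order = "little" if little_endian else "big"
    return list(word.to_bytes(4, order))


def h_registers_to_word(regs: list[int], little_endian: bool = False) -> int:
    """Reassemble a 32-bit word from four register bytes (hypothetical helper)."""
    if len(regs) != 4:
        raise ValueError("expected four H-register bytes")
    order = "little" if little_endian else "big"
    return int.from_bytes(bytes(regs), order)
```

In this sketch the same 32-bit value yields reversed register contents in the two modes, and converting back with the matching mode recovers the original word.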

[0059] The mapping and endian modes are specified by the host issuing a LOAD command (see below) and placing mode register fields (see Table 1 below) onto the memory address lines. The mode register fields are stored in the mode register 304 which sends the mode signals 312 to the address transform logic 308 so that the address transform logic can interpret the data in the PE array 110 appropriately.

TABLE 1 Mode register fields

Bits      Field       Comments
0 to 1    Mapping     0: word mapping; 1: half-word mapping; 2, 3: byte mapping
2         Endianism   0: big-endian byte mapping; 1: little-endian byte mapping
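As a minimal sketch, the mode register fields of Table 1 could be decoded as shown below. The sketch assumes that bit 0 is the least significant bit of the value placed on the memory address lines; the names Mapping and decode_mode are hypothetical and are used for illustration only.

```python
from enum import Enum


class Mapping(Enum):
    WORD = "word"
    HALF_WORD = "half-word"
    BYTE = "byte"


def decode_mode(mode_bits: int) -> tuple[Mapping, bool]:
    """Return (mapping, little_endian) from the low three mode register bits."""
    mapping_field = mode_bits & 0b11            # bits 0 to 1: mapping
    little_endian = bool((mode_bits >> 2) & 1)  # bit 2: endianism
    if mapping_field == 0:
        mapping = Mapping.WORD
    elif mapping_field == 1:
        mapping = Mapping.HALF_WORD
    else:
        mapping = Mapping.BYTE                  # field values 2 and 3 both select byte mapping
    return mapping, little_endian
```

Bits above bit 2 are ignored in this sketch because Table 1 does not define them.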

[0060] Referring to FIG. 7, a state diagram for the finite state machine FSM 306 is shown. As mentioned above, the FSM 306 receives conventional memory access command signals 212 from the host 102. The conventional memory access commands, which are interpreted by and implemented in the FSM 306 and shown in FIG. 7, are listed in Table 2 below.

TABLE 2 Command Functions and Encoding

Command value   RAS   CAS   WE   State
7               1     1     1    NOP 760
6               1     1     0    Burst Terminate 764
5               1     0     1    Read 756
4               1     0     0    Write 758
3               0     1     1    Active 754
2               0     1     0    Deactivate 752

[0061] In Table 2, the command signals 212 sent by the host 102 are the conventional memory access signals: RAS (Row Address Signal); CAS (Column Address Signal); and WE (Write Enable), which are interpreted by the FSM 306 as the states listed in Table 2 and shown in FIG. 7.

[0062] As can be seen from FIG. 7, the FSM 306 will remain in an idle state 702 or an active state 704 indefinitely until a command is issued by the host 102.
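Purely as an illustrative sketch, the encoding of Table 2 can be expressed as a lookup keyed on the three command signals. The sketch assumes that the command value is the three-bit number formed with RAS as the most significant bit and WE as the least significant bit, so that command value 2 corresponds to RAS=0, CAS=1, WE=0. The function name decode_command is hypothetical, and encodings not listed in Table 2 are returned as None.

```python
# Command value -> command name, following Table 2.
COMMANDS = {
    7: "NOP",
    6: "BURST TERMINATE",
    5: "READ",
    4: "WRITE",
    3: "ACTIVE",
    2: "DEACTIVATE",
}


def decode_command(ras: int, cas: int, we: int):
    """Map the RAS, CAS and WE signal levels (0 or 1) to a command name, or None."""
    value = ((ras & 1) << 2) | ((cas & 1) << 1) | (we & 1)
    return COMMANDS.get(value)
```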

[0063] From the idle state 702, before data can be accessed, a page must be activated using the ACTIVE command 754 (see Table 2) to enter the active state 704 in which a page of 256 32-bit values has been activated in the H-registers 402 for reading and writing by the host 102. Activation consists of loading data from the memory 106 into the H-registers 402 of the processing elements according to the mapping scheme currently in force. The ACTIVE command 754 can take a variable amount of time, so a READY signal 222 signals to the host 102 that the ACTIVE command 754 has completed and the active state 704 has been entered. After an ACTIVE command 754 has been issued by the host 102, the command inputs will be ignored until after the READY signal 222 goes high indicating completion of the ACTIVE command 754. Once a page has been activated it remains active until a DEACTIVATE or PRECHARGE command is registered for that page.

[0064] FIG. 8 is a timing diagram illustrating the operation of the ACTIVE command. In FIGS. 8-15, the various signals shown have the following significance.

TABLE 3 Signal Descriptions

Signal     In/Out   Description
m_clk      Out      Memory Port Timing Reference Clock. m_clk runs at twice the frequency of the master clock clk_in. Memory port transactions are timed relative to the rising edge of m_clk.
m_d[32]    In/Out   Memory interface data.
m_a[12]    In       Memory interface address.
m_cmd[3]   In       Memory interface command.
m_page     In       Memory interface page select: selects which page of H registers is activated by the current command.
m_ce       In       Memory interface enable: transaction only takes place when m_ce is active.
m_oe       In       Memory interface output enable: when (1), chip drives m_d out. When (0) m_d is high impedance.
m_rdy      Out      Memory interface ready: indicates completion of ACTIVE or DEACTIVATE command. A command should only be issued when m_rdy is high. After an ACTIVE or DEACTIVATE command is registered, no other commands are registered until the first clock edge after m_rdy goes high signalling completion.

[0065] In addition, the timing parameters used in FIGS. 8-15 have the following significance.

TABLE 4 Timing Parameters

Timing     Description                         Min (ns) / Max (ns)
t_m_CS     Command setup to clock              2.0
t_m_CH     Command hold after clock            0.0
t_m_AS     Address setup to clock              2.0
t_m_AH     Address hold after clock            0.0
t_m_DIS    Data in setup to clock              2.0
t_m_DIH    Data in hold after clock            0.0
t_m_DOV    Data output, clock to data valid    3.0 / 6.0
t_m_DHZ    Data output, m_oe to high Z         3.0
t_m_DLZ    Data output, m_oe to low Z          1.0 / 4.5
t_m_RV     m_rdy, clock to valid               3.0 / 6.0
t_m_SKEW   m_clk skew vs. clk_in               0
t_m_CLK    Clock period                        15

[0066] From the active state 704, upon receipt of the READ command 756 (see Table 2), the FSM 306 enters a read state 706 in which data is transferred in a burst from the H-registers 402 along the second data path 220 to the data register 206 and from there to the host 102 along the first data path 208. Read accesses to the DRAM are burst-orientated, up to a maximum burst length of 256 32 bit words (a whole page). The first READ or WRITE command, described below, can be registered on the clock edge following the READY signal going high. The array address for beginning the read burst is taken from bits 7 to 0 (LSBs) of the memory address 210, corresponding to the column address received with the CAS assertion. If a read burst runs off the end of the page, then it wraps around back to the start of the page and continues automatically. Bursts may be any length, but if a burst continues for longer than a page of H-registers, namely 256 transfers, the data will be repeated.
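The wrap-around behaviour of a burst can be sketched as follows. This is an illustration only: it assumes that the array address advances by one 32 bit word per transfer, which the description implies but does not state, and the name burst_addresses is hypothetical.

```python
PAGE_WORDS = 256  # one page holds 256 32 bit words


def burst_addresses(memory_address: int, length: int) -> list[int]:
    """List the array addresses visited by a burst of `length` transfers."""
    start = memory_address & 0xFF  # bits 7 to 0 give the starting column address
    # Running off the end of the page wraps back to the start of the page.
    return [(start + i) % PAGE_WORDS for i in range(length)]
```

Under these assumptions a burst of more than 256 transfers revisits addresses it has already covered, which corresponds to the repeated data described above.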

[0067] FIG. 10 is a timing diagram illustrating the operation of a single burst READ command and FIG. 11 is a timing diagram illustrating the operation of consecutive READ commands, illustrating the termination of prior READ bursts by subsequent READ commands.

[0068] From the active state 704, upon receipt of the WRITE command 758 (see Table 2), the FSM 306 enters a write state 704 in which data is transferred in a burst from the host 102 to the data register 206 along the first data path 208 and from the data register 206 to the H-registers 402 along the second data path 220. Write accesses to the DRAM are burst-orientated, up to a maximum burst length of 256 32 bit words (a whole page). The array address 214 for beginning the write burst is taken from bits 7 to 0 (LSBs) of the memory address 210, corresponding to the column address received with the CAS assertion. If a write burst runs off the end of the page, then it wraps around back to the start of the page and continues automatically. Bursts may be any length, but if a burst continues for longer than a page of H-registers, namely 256 transfers, the written locations will be repeated and overwritten.

[0069] FIG. 12 is a timing diagram illustrating the operation of a single burst WRITE command and FIG. 13 is a timing diagram illustrating the operation of consecutive WRITE commands, illustrating the termination of prior WRITE bursts by subsequent WRITE commands.

[0070] READ and WRITE commands may be interleaved as illustrated in the timing diagram of FIG. 14. NOP commands may be inserted between consecutive READ commands or WRITE commands or interleaved READ and WRITE commands as illustrated in the timing diagram of FIG. 15, where a single NOP is inserted between the third and fourth WRITE commands to obtain a WRITE burst of 2 32-bit words. In FIG. 15, consecutive WRITE commands are shown addressed to alternate pages by toggling of the m_page signal. A burst to one page is terminated by any command to the other page.

[0071] A burst terminate command 764 (see Table 2) may be issued by the host 102 to terminate a data read or write burst and return the FSM 306 to the active state 704.

[0072] From the active, read or write states 702, 704 or 706, upon receipt of the DEACTIVATE or PRECHARGE command 752 (see Table 2), a page in the H-registers 402 is deactivated and its contents are returned to the memory 106 at the row corresponding to the row address part of the memory address 210 via the DRAM registers 404. The DEACTIVATE command can take a variable amount of time. Again, the READY signal is used to signal to the host that the DEACTIVATE or PRECHARGE command has completed. Thus, after a DEACTIVATE or PRECHARGE command 752 has been issued by the host 102, the command inputs will be ignored until after a READY signal 222 is asserted indicating completion of the DEACTIVATE or PRECHARGE command 752. If a page is activated by issuance of an ACTIVE command 754 and then no WRITE command 758 is issued, since no data has been written into the PE array 110 by the memory interface 112, the DEACTIVATE or PRECHARGE command 752 terminates immediately, taking no action and asserting the READY signal 222.
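The command handling for a single page, including the case in which DEACTIVATE takes no action because nothing was written, may be modelled roughly as below. This is a simplified sketch and not the FSM of FIG. 7: the class PageModel, the dirty flag, the rejection of data commands while idle and the omission of the READY handshake and the LOAD command are all assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    ACTIVE = auto()
    READ = auto()
    WRITE = auto()


class PageModel:
    """Hypothetical per-page model of the command handling described above."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.dirty = False  # set once the host has written to the active page

    def command(self, name: str) -> bool:
        """Apply one command; return True only if DEACTIVATE wrote the page back."""
        if name == "NOP":
            return False  # operations in progress are not affected
        if name == "ACTIVE" and self.state is State.IDLE:
            self.state, self.dirty = State.ACTIVE, False
            return False
        if self.state is State.IDLE:
            raise ValueError(f"{name} requires an active page")
        if name == "READ":
            self.state = State.READ
        elif name == "WRITE":
            self.state, self.dirty = State.WRITE, True
        elif name == "BURST TERMINATE":
            self.state = State.ACTIVE
        elif name == "DEACTIVATE":
            wrote_back = self.dirty  # no write-back if the page was never written
            self.state, self.dirty = State.IDLE, False
            return wrote_back
        else:
            raise ValueError(f"unsupported command: {name}")
        return False
```

Since two pages can be active at once, a fuller model under the same assumptions would hold one such object per page and select between them with the page select.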

[0073] FIG. 9 is a timing diagram illustrating the operation of the DEACTIVATE command.

[0074] The NOP command 760 (see Table 2) is used to prevent unwanted commands from being registered during the idle, active, read or write states. Operations that are already in progress are not affected by issuance of the NOP command 760 by the host 102.

[0075] The LOAD command 762 (see Table 2) is a single-cycle command that can be issued at any time, except during activation and deactivation. Issuance of a LOAD command 762 by the host 102 will immediately terminate any read or write burst that is currently taking place. The LOAD command 762 causes the mode fields placed onto the memory address lines by the host 102 to be loaded into the mode register 304.

[0076] It will of course be understood that the present invention has been described above purely by way of example and modifications of detail can be made within the scope of the invention.

1. A memory interface for a parallel processor having an array of processing elements, the memory interface being adapted to operate as follows: to receive memory control signals and memory addresses from a host; to apply at least a portion of the memory addresses to a memory connected to the processing elements, and to apply control signals to the processing elements, such that in response the processing elements transfer data: to and from the memory at the memory address; or to and from the host; or both; and wherein the memory interface is adapted to connect to a host configured to access data in a conventional memory device, such that the host can access data in the memory.
2. A memory interface circuit according to claim 1 further comprising a data register via which the processing elements transfer data to and from the host.
3. A memory interface according to claim 1 or claim 2 wherein the memory control signals and memory addresses include a row address signal RAS, a row address, a column address signal CAS, a column address and a write enable signal WE.
4. A memory interface for a parallel processor having an array of processing elements, the memory interface being adapted to operate as follows: to receive memory control signals and memory addresses from a host; to apply at least a portion of the memory addresses to a memory connected to the processing elements; and to apply control signals to the processing elements, such that in response the processing elements transfer data: to and from the memory at the memory address; or to and from the host; or both; and wherein the memory control signals and memory addresses include a row address signal RAS, a row address, a column address signal CAS, a column address and a write enable signal WE.
5. A memory interface according to claim 3 or claim 4, adapted, on receipt of a row address and a first configuration of memory control signals including a RAS assertion, to activate a page of data by transferring data from the row in the memory corresponding to the row address, into the processing elements.
6. A memory interface according to claim 5, adapted, on receipt of a second configuration of memory command signals to deactivate the page of data by transferring data from the processing elements into the row in the memory corresponding to the row address.
7. A memory interface according to claim 5 or claim 6, adapted, on receipt of the first configuration of memory command signals, to apply the row address to the memory, to enable the page of data to be transferred from the memory into the processing elements.
8. A memory interface according to any one of claims 4-7, adapted, on receipt of a column address and a third configuration of memory command signals including a CAS assertion and a WE deassertion, to transfer data from the activated page of data in the processing elements to the host, beginning with data from the column in the memory corresponding to the column address.
9. A memory interface according to any one of claims 5-8, adapted, on receipt of a column address and a fourth configuration of memory command signals including a CAS assertion and a WE assertion, to transfer data from the host to the activated page of data in the processing elements, beginning with data for the column in the memory corresponding to the column address.
10. A memory interface according to claim 8 or claim 9 adapted, on receipt of the third or fourth configuration of memory control signals, to apply the column address to the processing elements, to identify the corresponding column in the memory in respect of which the first data is to be transferred between the host and the activated page of data in the processing elements.
11. A memory interface according to any one of claims 5-10, adapted, on receipt of a further row address and the first configuration of memory control signals including a RAS assertion, to activate a second page of data by transferring data from the row in the memory corresponding to the further row address, into the processing elements.
12. An active memory comprising: a memory; an array of processing elements connected to the memory; and a memory interface according to any preceding claim.
13. An active memory according to claim 12 further comprising one or more page registers in each processing element from or into which data is transferred to and from the memory and the host.
14. An active memory according to claim 13, wherein data from each memory address is transferred to and from the page registers in a single processing element.
15. An active memory according to claim 13, wherein data from different memory addresses is transferred to and from corresponding page registers in a plurality of processing elements.
16. A method of reading data from an active memory including a memory and an array of processing elements connected to the memory, comprising: activating a page of data by transferring data from a row in the memory into the processing elements; and reading data from the activated page of data in the processing elements and outputting it to a host.
17. A method of writing data to an active memory including a memory and an array of processing elements connected to the memory, comprising: activating a page of data by transferring data from a row in the memory into the processing elements; and inputting data from a host and writing it to the activated page of data in the processing elements.
18. A method according to claim 16 or claim 17 further comprising: deactivating the page of data by transferring data from the processing elements into the row in the memory corresponding to the row address.